Multimodal Learning For Classroom Activity Detection
Classroom activity detection (CAD) focuses on accurately classifying whether
the teacher or a student is speaking and on recording the length of individual
utterances during a class. A CAD solution gives teachers instant feedback
on their pedagogical instruction, which can improve their teaching and, in
turn, student achievement. However, CAD is very challenging because (1) the
CAD model must generalize well across different teachers and students; (2)
data from the vocal and language modalities has to be fused carefully so
that the two are complementary; and (3) the solution should not rely heavily
on additional recording devices. In this
paper, we address the above challenges with a novel attention-based neural
framework. Our framework not only extracts both speech and language
information, but also uses an attention mechanism to capture long-term
semantic dependencies. Our framework is device-free and can take any
classroom recording as input. The proposed CAD learning framework is
evaluated in two real-world education applications. The experimental results
demonstrate the benefits of our approach to learning attention-based neural
networks from classroom data with different modalities, and show that our
approach outperforms state-of-the-art baselines on various evaluation metrics.
Comment: The 45th International Conference on Acoustics, Speech, and Signal
Processing (ICASSP 2020)
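The abstract does not specify the attention variant, but the cross-modal fusion it describes can be illustrated with standard scaled dot-product attention. The sketch below is a minimal, untrained numpy illustration; the feature arrays and dimensions are hypothetical stand-ins for the acoustic and language features the paper extracts.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Standard scaled dot-product attention over a feature sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (Tq, Tk) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (Tq, d) attended features

rng = np.random.default_rng(0)
T, d = 6, 8                              # 6 utterance frames, 8-dim features
speech_feats = rng.normal(size=(T, d))   # hypothetical acoustic features
text_feats = rng.normal(size=(T, d))     # hypothetical language features

# Cross-modal fusion: let the language features attend over the speech
# features, so each text frame pools complementary acoustic evidence.
fused = scaled_dot_product_attention(text_feats, speech_feats, speech_feats)
print(fused.shape)  # (6, 8)
```

Because every output row is a convex combination of speech frames, a classifier on `fused` can draw on acoustic context from anywhere in the recording, which is the long-range dependence the abstract attributes to attention.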
Learning Multi-level Dependencies for Robust Word Recognition
Robust language processing systems are becoming increasingly important given
the recent awareness of dangerous situations where brittle machine learning
models can be easily broken in the presence of noise. In this paper, we
introduce a robust word recognition framework that captures multi-level
sequential dependencies in noisy sentences. The proposed framework employs a
sequence-to-sequence model over the characters of each word, whose output is
fed to a word-level bi-directional recurrent neural network. We conduct
extensive experiments to verify the effectiveness of the framework. The
results show that the proposed framework outperforms state-of-the-art methods
by a large margin, and they also suggest that character-level dependencies
can play an important role in word recognition.
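The two-level hierarchy described here can be sketched with a toy numpy model: a character-level recurrence (standing in for the paper's full sequence-to-sequence component) folds each word into a vector, and a word-level bidirectional RNN then contextualizes those vectors across the sentence. All parameters below are random and hypothetical; this shows the data flow, not a trained system.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16  # hidden size (illustrative)

# Tiny random parameters -- untrained, for shape/flow illustration only.
W_emb = rng.normal(scale=0.1, size=(128, d))  # one embedding per ASCII code
W_ch = rng.normal(scale=0.1, size=(d, d))     # character-level recurrence
W_wf = rng.normal(scale=0.1, size=(d, d))     # word-level forward recurrence
W_wb = rng.normal(scale=0.1, size=(d, d))     # word-level backward recurrence

def encode_word(word):
    """Character-level RNN: fold a word's characters into one vector."""
    h = np.zeros(d)
    for ch in word:
        h = np.tanh(W_emb[ord(ch) % 128] + W_ch @ h)
    return h

def encode_sentence(words):
    """Word-level bidirectional RNN over character-derived word vectors."""
    xs = [encode_word(w) for w in words]
    fwd, bwd, h = [], [], np.zeros(d)
    for x in xs:                  # forward direction
        h = np.tanh(x + W_wf @ h)
        fwd.append(h)
    h = np.zeros(d)
    for x in reversed(xs):        # backward direction
        h = np.tanh(x + W_wb @ h)
        bwd.append(h)
    bwd.reverse()
    # One contextual vector per word: concatenation of both directions.
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

# A deliberately misspelled ("noisy") input sentence.
states = encode_sentence("a noisy sentnece to corect".split())
print(states.shape)  # (5, 32)
```

In the paper's framework, a decoder over each word's states would emit the corrected spelling; here the point is that each word's representation is built from its characters (robust to typos) yet conditioned on its neighbors (robust to ambiguity).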
What Can Spontaneous Facial Expression Tell Us?
Facial expression plays a significant role in human communication. It is considered the single most important cue in the psychology of emotion. Facial expression is taken as a universally understood signal that triggers a discrete categorical basic emotion: joy, sadness, fear, surprise, anger, or disgust. Thus, automatic analysis of emotion from images of human facial expression has been an interesting and challenging problem for the past 30 years. Aiming toward applications in human behavior analysis, human-human interaction, and human-computer interaction, this topic has recently drawn even more attention.

Automatic analysis of facial expression in a realistic scenario is a much more difficult problem because the 2-D imagery of human facial expression combines rigid head motion and non-rigid muscle motion. We are tasked to solve this "coupled-motion" problem and analyze facial expression in a meaningful manner. We first propose an image-based representation, the Emotion Avatar Image, to support person-independent expression recognition. Second, a real-time registration technique is designed to improve frame-based streaming action unit (AU) recognition. The proposed expression recognition techniques are then applied to the field of advertising, where audiences' commercial-watching behavior is thoroughly analyzed.
Reference-based person re-identification
Person re-identification refers to recognizing people across non-overlapping cameras at different times and locations. Due to variations in pose, illumination conditions, background, and occlusion, person re-identification is inherently difficult. In this paper, we propose a reference-based method for cross-camera person re-identification. In training, we learn a subspace in which the correlations of the reference data from different cameras are maximized using Regularized Canonical Correlation Analysis (RCCA). For re-identification, the gallery data and the probe data are projected into the RCCA subspace, and the reference descriptors (RDs) of the gallery and probe are constructed by measuring their similarity to the reference data. The identity of the probe is determined by comparing the RD of the probe with the RDs of the gallery. Experiments on a benchmark dataset show that the proposed method outperforms state-of-the-art approaches.
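The RCCA pipeline above can be sketched end to end in numpy: fit regularized CCA on paired reference data from the two cameras, project gallery and probe into the shared subspace, form each sample's reference descriptor as its similarity vector to the projected reference set, and match by comparing descriptors. This is a minimal sketch on synthetic data; the regularization value, dimensions, and the cosine similarity used for the RDs are assumptions, not the paper's exact choices.

```python
import numpy as np

def rcca(X, Y, reg=0.1, k=3):
    """Regularized CCA: projections maximizing cross-camera correlation."""
    X, Y = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])  # regularized covariances
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n
    # Canonical directions solve Sxx^-1 Sxy Syy^-1 Syx w = rho^2 w.
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    vals, vecs = np.linalg.eig(M)
    Wx = vecs[:, np.argsort(-vals.real)[:k]].real
    Wy = np.linalg.solve(Syy, Sxy.T) @ Wx        # paired Y-side projection
    return Wx, Wy

def reference_descriptor(z, ref_proj):
    """RD: cosine similarity of a projected sample to each reference sample."""
    z = z / (np.linalg.norm(z) + 1e-9)
    R = ref_proj / (np.linalg.norm(ref_proj, axis=1, keepdims=True) + 1e-9)
    return R @ z

# Synthetic data: shared identity latents seen through two camera mappings.
rng = np.random.default_rng(2)
n_ref, d = 20, 6
latent = rng.normal(size=(n_ref, 3))
A, B = rng.normal(size=(3, d)), rng.normal(size=(3, d))
X_ref = latent @ A + 0.05 * rng.normal(size=(n_ref, d))  # camera A references
Y_ref = latent @ B + 0.05 * rng.normal(size=(n_ref, d))  # camera B references

Wx, Wy = rcca(X_ref, Y_ref)

ids = rng.normal(size=(5, 3))                            # 5 test identities
probe = ids @ A + 0.05 * rng.normal(size=(5, d))         # seen by camera A
gallery = ids @ B + 0.05 * rng.normal(size=(5, d))       # seen by camera B

# Project references, probe, and gallery into the RCCA subspace.
refX, refY = (X_ref - X_ref.mean(0)) @ Wx, (Y_ref - Y_ref.mean(0)) @ Wy
P, G = (probe - X_ref.mean(0)) @ Wx, (gallery - Y_ref.mean(0)) @ Wy

RD_probe = np.stack([reference_descriptor(p, refX) for p in P])
RD_gal = np.stack([reference_descriptor(g, refY) for g in G])

# Match each probe to the gallery RD with the highest cosine similarity.
Rp = RD_probe / np.linalg.norm(RD_probe, axis=1, keepdims=True)
Rg = RD_gal / np.linalg.norm(RD_gal, axis=1, keepdims=True)
match = np.argmax(Rp @ Rg.T, axis=1)
print(match)
```

The indirection through reference descriptors is what makes the method "reference-based": probe and gallery are never compared directly, only through how each relates to the shared reference set, which absorbs camera-specific appearance differences.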